---
title: "Homework 8 R Exercise"
output:
  pdf_document: default
  html_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

We will study power and robustness with a couple simulation exercises.  

In the first exercise, we will compare the power of a two-sample pooled t test to the Wilcoxon rank sum test.  To start, we will generate 1000 sets of two independent *normal* random samples with different means but equal variances (the t-test's home turf), and compare the proportion of times the two tests reject $H_0$.  The Wilcoxon tests take a little time to carry out; do not be concerned if your results are not generated right away.  Which test had the highest power (proportion of rejections) for you?


```{r eval=FALSE}
#Generate 1000 samples of size 10 for two different
#groups, with different means, but the same variance
X1_values=matrix(rnorm(10000,mean=1,sd=1),ncol=10)
X2_values=matrix(rnorm(10000,mean=0,sd=1),ncol=10)
#Run pooled t-tests and rank sum tests on each of 1000 samples
TTest=apply(X1_values,1,t.test,y=X2_values,var.equal=TRUE)
WTest=apply(X1_values,1,wilcox.test,y=X2_values)
#Store t-test p-values and Wilcoxon p-values
T_pvalues=as.numeric(unlist(TTest)[names(unlist(TTest))=="p.value"])
W_pvalues=as.numeric(unlist(WTest)[names(unlist(WTest))=="p.value"])
#Proportion of cases out of 1000 for which we rejected H_0.
sum(T_pvalues<0.05)/1000
sum(W_pvalues<0.05)/1000

```
We can repeat the same exercise, but replace the normal random samples with Cauchy random samples.  The Cauchy distribution is a heavy-tailed distribution for which the t-test will behave poorly.  Note that the Cauchy distribution does not have a mean and standard deviation, but does have location and scale parameters, which we specify instead.  Which test had the higher power here?
```{r eval=FALSE}
#Generate 1000 samples of size 10 for two different
#groups, with different means, but the same variance
X1_values=matrix(rcauchy(10000,location=1,scale=1),ncol=10)
X2_values=matrix(rcauchy(10000,location=0,scale=1),ncol=10)
#Run pooled t-tests and rank sum tests on each of 1000 samples
TTest=apply(X1_values,1,t.test,y=X2_values,var.equal=TRUE)
WTest=apply(X1_values,1,wilcox.test,y=X2_values)
#Store t-test p-values and Wilcoxon p-values
T_pvalues=as.numeric(unlist(TTest)[names(unlist(TTest))=="p.value"])
W_pvalues=as.numeric(unlist(WTest)[names(unlist(WTest))=="p.value"])
#Count number of cases out of 1000 for which we rejected H_0.
sum(T_pvalues<0.05)/1000
sum(W_pvalues<0.05)/1000

```